Abstract

The maximum observable correlation between the two components of a bipartite quantum system 
is a property of the joint density operator, and is achieved by making particular measurements on 
the respective components. For pure states it corresponds to making measurements diagonal in 
a corresponding Schmidt basis. More generally, it is shown that the maximum correlation may 
be characterised in terms of a 'correlation basis' for the joint density operator, which defines the 
corresponding (nondegenerate) optimal measurements. The maximum coincidence rate for spin 
measurements on two-qubit systems is determined to be (1 + s)/2, where s is the spectral norm 
of the spin correlation matrix, and upper bounds are obtained for n-valued measurements on 
general bipartite systems. It is shown that the maximum coincidence rate is never greater than the 
computable cross norm measure of entanglement, and a much tighter upper bound is conjectured. 
Connections with optimal state discrimination and entanglement bounds are briefly discussed. 

PACS numbers: 03.65.Ta, 03.67.-a 
I. INTRODUCTION



Suppose that two observers, Alice and Bob, have access to the respective components 
of a bipartite quantum system. If the observers make measurements of observables A and 
B respectively, the correlation between the measurement outcomes will clearly depend on 
A and B. It is therefore of interest to ask what choice of A and B will give the maximum 
possible correlation. The answer would allow bipartite states to be ranked in terms of their 
joint-correlation properties. It is also relevant to the efficient generation of secure keys in 
quantum cryptography where, all other things being equal, Alice and Bob should aim to 
compare measurement outcomes which are maximally correlated for a given shared state 
[1, 2]. 

It is important to make a distinction here between trivial and non-trivial correlations. 
For example, if Alice and Bob each simply measure the unit operator, their results will of 
course be perfectly (but trivially) correlated. Hence the answer to the above question is 
only of interest if it can be ensured that the measurement outcomes for each component 
have some useful degree of randomness. This is critical, for example, if Alice and Bob wish 
to generate a secure cryptographic key [2]. As will be shown, a natural approach is to 
require that the measured observables are 'maximally informative' or 'nondegenerate'. This 
is equivalent to requiring the observables to be described by maximal probability operator 
measures (POMs), i.e., $A = \{|a_j\rangle\langle a_j|\}$, $B = \{|b_k\rangle\langle b_k|\}$. It turns out that this requirement is
in fact naturally built into some measures of correlation (e.g., the mutual information), while
it must be imposed explicitly for others (e.g., the coincidence rate).

For the case of a pure bipartite state, $|\psi\rangle\langle\psi|$, there is an intuitively obvious answer to
the above question: Alice and Bob should choose A and B such that the kets $\{|a_j\rangle\}$ and
$\{|b_j\rangle\}$ correspond to a Schmidt decomposition of $|\psi\rangle$, i.e., such that

$$ |\psi\rangle = \sum_j \sqrt{p_j}\, |a_j\rangle \otimes |b_j\rangle . \tag{1} $$

Thus, each possible measurement outcome $A = a_j$ will be perfectly correlated with the
corresponding measurement outcome $B = b_j$. Note for this case that $\langle a_j|a_k\rangle = \delta_{jk} = \langle b_j|b_k\rangle$.
Hence, the optimal observables are described by orthogonal POMs, and can be equivalently
represented by the Hermitian operators $A = \sum_j a_j |a_j\rangle\langle a_j|$ and $B = \sum_j b_j |b_j\rangle\langle b_j|$ acting on
the respective Hilbert space components [3, 4].

More generally, when the bipartite state is described by some density operator $\rho$, finding
the maximal POMs $A = \{|a_j\rangle\langle a_j|\}$, $B = \{|b_k\rangle\langle b_k|\}$ that maximise a given measure of
correlation is quite difficult. Such a pair of maximally-correlated observables will determine
a corresponding basis set, $\{|a_j\rangle \otimes |b_k\rangle\}$, for the bipartite system. This basis set generalises
the notion of the Schmidt basis for pure states, and may be called a correlation basis for $\rho$.
Unlike the Schmidt basis, the correlation basis need not always be orthonormal. 

Mutual information and coincidence rate, as measures of correlation, are briefly discussed 
in Sec. II. Formal equations for the correlation basis are given in Sec. III, for the case of
coincidence rate, and illustrated with examples in Sec. III C, including connections with the
problem of optimal state discrimination. It is conjectured that at least one of the optimal 
observables A and B can always be chosen to correspond to an orthogonal POM. In Sec. IV, 
the maximum coincidence rate for two-valued measurements on pairs of qubits is explicitly 
determined as a simple function of the spectral norm of the 3 × 3 spin correlation matrix.
This result is generalised in Sec. V, where general upper bounds for coincidence rate are 
obtained for n-valued measurements, based on a singular value decomposition of the Fano 
form of the density matrix [5, 6]. These bounds are related to the computable cross norm 
[7], and are generalised in Sec. VI to connect other linear correlation bounds (such as spin 
covariance) with entanglement properties. 

II. MUTUAL INFORMATION VS COINCIDENCE RATE 

To find the optimally correlated observables for a given bipartite system, it is necessary 
to first quantify joint correlation in some manner. Now, the statistics of any two observables 
A and B, measured on the respective components of the system, can always be described by 
corresponding probability operator measures (POMs) $\{A_j\}$ and $\{B_k\}$ (i.e., sets of positive
operators which sum to the unit operator [3, 4]), with the joint probability of measurement
outcomes $A = a_j$ and $B = b_k$ for a bipartite density operator $\rho$ being given by

$$ p_{jk} = \mathrm{tr}[\rho\, A_j \otimes B_k]. $$

Any measure of correlation will be some function of the probability distribution $p_{jk}$, and
two well known examples are discussed in the following. 

First, the mutual information is defined by [8]

$$ I(A,B|\rho) := \sum_{j,k} p_{jk} \log_2 \frac{p_{jk}}{p_j q_k}, \tag{2} $$

where $p_j$ and $q_k$ denote the marginal distributions for A and B respectively. This quantity
vanishes for uncorrelated observables; is invariant under relabellings of measurement
outcomes; and has a simple physical interpretation: if A and B are each measured for a large
number of copies of $\rho$, then $I(A,B|\rho)$ is the average amount of data gained per measurement
outcome of A, about the corresponding sequence of measurement outcomes of B (as
quantified by the number of bits required to represent the data), and vice versa [8].
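
As a purely illustrative sketch, the mutual information of Eq. (2) may be evaluated numerically from any joint outcome distribution. The function names below are hypothetical, and it is assumed that the joint distribution is supplied as a nested list with entries p[j][k]; terms with vanishing joint probability are taken to contribute zero.

```python
import math

def marginals(p):
    """Return the marginal distributions (p_j, q_k) of a joint distribution p[j][k]."""
    rows = [sum(row) for row in p]
    cols = [sum(p[j][k] for j in range(len(p))) for k in range(len(p[0]))]
    return rows, cols

def mutual_information(p):
    """Mutual information in bits, following Eq. (2); zero-probability terms are skipped."""
    pj, qk = marginals(p)
    total = 0.0
    for j, row in enumerate(p):
        for k, pjk in enumerate(row):
            if pjk > 0.0:
                total += pjk * math.log2(pjk / (pj[j] * qk[k]))
    return total

# Perfectly correlated fair bits, independent fair bits, and a trivial
# single-outcome measurement (the unit operator measured by both observers).
perfect = [[0.5, 0.0], [0.0, 0.5]]
independent = [[0.25, 0.25], [0.25, 0.25]]
trivial = [[1.0]]

for name, dist in (("perfect", perfect), ("independent", independent), ("trivial", trivial)):
    print(name, mutual_information(dist))
```

The trivial case illustrates the point made in the Introduction: measuring the unit operator yields perfectly matched outcomes but no mutual information.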

The convexity of mutual information implies that the maximum mutual information for a
given state $\rho$ (also called the accessible information) can always be achieved via observables
described by maximal POMs [9], i.e., with $A_j = |a_j\rangle\langle a_j|$, $B_k = |b_k\rangle\langle b_k|$. Thus,

$$ I_{\max}(\rho) := \max_{A,B} I(A,B|\rho) = \max_{A,B\ \mathrm{maximal}} I(A,B|\rho). \tag{3} $$

While it is very difficult to determine the optimal observables A and B in Eq. (3), a useful 
upper bound follows from application of the Holevo bound to the ensemble of states induced 
on one component of the bipartite system by a measurement on the other component (see 
Eqs. (12) of Ref. [10]): 

$$ I_{\max}(\rho) \le \min\{S(\rho_1), S(\rho_2)\}. \tag{4} $$

Here $S(\cdot)$ denotes the von Neumann entropy, and $\rho_1$ and $\rho_2$ are the reduced operators for
the first and second components of the bipartite system. This bound is sufficiently strong to
obtain the maximum mutual information for any mixture $\rho = \sum_\alpha \lambda_\alpha |\psi_\alpha\rangle\langle\psi_\alpha|$ of pure states
sharing a common Schmidt basis up to trivial phase factors, i.e., with

$$ |\psi_\alpha\rangle = \sum_j \left[p_j^{(\alpha)}\right]^{1/2} \exp\!\left[i\phi_j^{(\alpha)}\right] |a_j\rangle \otimes |b_j\rangle . $$

In particular, Eq. (4) is saturated by choosing A and B to be the maximal POMs generated 
by this Schmidt basis, yielding 

$$ I_{\max}\Big(\sum_\alpha \lambda_\alpha |\psi_\alpha\rangle\langle\psi_\alpha|\Big) = -\sum_j P_j \log_2 P_j, \tag{5} $$

where $P_j := \sum_\alpha \lambda_\alpha p_j^{(\alpha)}$. Note that for a pure state, the maximally correlated observables
are therefore those corresponding to a Schmidt basis for $|\psi\rangle$, justifying the intuitive
answer given in the Introduction.



Second, the coincidence rate measure of correlation is defined by 

$$ C(A,B|\rho) := \sum_j p_{jj}. \tag{6} $$

This quantity is simply the probability of the observers obtaining matched outcomes, and
reaches a maximum of unity only when the outcomes are perfectly correlated (i.e., $p_{jk} = p_j \delta_{jk}$).
It is also a little more tractable than mutual information, and will therefore be the
focus of this paper. Note that coincidence rate (unlike mutual information) has no clear
meaning for continuously-valued outcomes: the quantity $C = \int dx\, p_{xx}$ is not invariant under
relabellings of the outcomes (e.g., for $x \to \lambda x$ one has $C \to C/\lambda$). Hence, only discretely-valued
POMs will be considered in what follows.

Unlike mutual information, the coincidence rate does not intrinsically distinguish between 
trivial and non-trivial correlations. For example, if Alice and Bob each merely measure the 
unit operator, they will obtain the maximum possible value of coincidence rate (unity), 
but the minimum possible value of mutual information (zero). Hence, as discussed in the 
Introduction, it is only of interest to maximise coincidence rate subject to some constraint 
that ensures a useful degree of randomness for the individual measurement outcomes. One 
reasonable constraint is the requirement that the measured observables are maximal POMs. 
This constraint is consistent with Eq. (3) for mutual information; does not allow the observers 
to remove potential information about correlations by merging measurement outcomes; and 
automatically rules out trivial correlations. The relevant problem of interest is then the 
determination of observables A and B which achieve the maximum value 

$$ C_{\max}(\rho) := \max_{A,B\ \mathrm{maximal}} C(A,B|\rho) = \max_{A,B\ \mathrm{maximal}} \sum_j \langle a_j, b_j|\rho|a_j, b_j\rangle. \tag{7} $$

In analogy to Eq. (5) for mutual information, one finds 

$$ C_{\max}\Big(\sum_\alpha \lambda_\alpha |\psi_\alpha\rangle\langle\psi_\alpha|\Big) = \sum_j P_j = 1 $$

for mixtures of states sharing a common Schmidt basis, including all pure states. From 
Eq. (7) one also obtains the general convexity property 

$$ C_{\max}(\lambda\rho + (1-\lambda)\sigma) \le \lambda\, C_{\max}(\rho) + (1-\lambda)\, C_{\max}(\sigma). $$

Hence, if two given observables A and B maximise the coincidence rate for some set of states,
$S_{AB}$, then this set is convex. Eq. (7) further implies that the maximum coincidence rate for
any member of $S_{AB}$ is bounded above by the largest eigenvalue of the 'coincidence operator'
$K_{AB} := \sum_j |a_j\rangle\langle a_j| \otimes |b_j\rangle\langle b_j|$, and hence that $S_{AB}$ contains a pure state if and only if this
largest eigenvalue is unity.

Finally, it may be recalled that mutual information and coincidence rate are both not 
only useful measures of correlation per se, but may also be used to differentiate 'classical' 
from 'quantum' correlations, via corresponding Bell inequalities. For example, if Alice can 
measure either of A and A', and Bob can measure either of B and B', and it is assumed that
the statistics of these four observables can be generated by some classical joint probability
distribution, then from Eq. (6.5) of Ref. [11] one has

$$ I(A,B|\rho) + I(A,B'|\rho) + I(A',B|\rho) - I(A',B'|\rho) \le H(A) + H(B), $$

where $H(\cdot)$ denotes the Shannon entropy, while from Eq. (8) of Ref. [12] one has

$$ C(A,B|\rho) + C(A,B'|\rho) + C(A',B|\rho) - C(A',B'|\rho) \le 2. $$

Each of these inequalities is violated, for example, by suitable spin measurements on a
singlet state. The use of correlation measures to characterise the minimum degree of
entanglement present has been recently discussed in Ref. [13]. Connections between correlation
and entanglement bounds are obtained in Secs. V and VI below.
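
The violation of the coincidence-rate inequality by a singlet state can be checked with a short numerical sketch. It assumes the standard singlet result that spin measurements along unit directions a and b agree with probability (1 - a·b)/2, restricts the directions to a common plane, and labels the alternative settings of the two observers A' and B'. The specific angles and function names are illustrative choices, not taken from the references.

```python
import math

def singlet_coincidence(theta_a, theta_b):
    """Probability that two spin measurements on a singlet state agree, for
    coplanar directions at angles theta_a and theta_b (radians).
    Uses the standard singlet correlation <(sigma.a)(sigma.b)> = -a.b."""
    return 0.5 * (1.0 - math.cos(theta_a - theta_b))

def coincidence_bell_sum(a, a_alt, b, b_alt):
    """C(A,B) + C(A,B') + C(A',B) - C(A',B') for the four settings."""
    return (singlet_coincidence(a, b)
            + singlet_coincidence(a, b_alt)
            + singlet_coincidence(a_alt, b)
            - singlet_coincidence(a_alt, b_alt))

# Illustrative settings (an assumption of this sketch):
# Alice measures at 0 or 270 degrees, Bob at 135 or 225 degrees.
a, a_alt, b, b_alt = (math.radians(x) for x in (0.0, 270.0, 135.0, 225.0))
bell_value = coincidence_bell_sum(a, a_alt, b, b_alt)
print(bell_value)  # about 2.414, above the classical bound of 2
```

For these settings the sum equals 1 + √2, whereas any classical joint distribution for the four observables gives at most 2.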

III. MAXIMISING COINCIDENCE RATE 
A. Conditions for extrema 

The linearity of coincidence rate with respect to A and B makes it straightforward to 
characterise the extremal observables, as per the following proposition. The conditions for 
such observables to maximise coincidence rate are less straightforward, however, and are left 
to the next subsection. 

Proposition 1: Necessary and sufficient conditions for maximal POMs $A = \{|a_j\rangle\langle a_j|\}$
and $B = \{|b_j\rangle\langle b_j|\}$ to attain an extremal value of coincidence rate, for bipartite state $\rho$, are

$$ \langle a_k, b_l|\rho|a_l, b_l\rangle = \langle a_k, b_k|\rho|a_l, b_k\rangle, \qquad \langle a_k, b_l|\rho|a_k, b_k\rangle = \langle a_l, b_l|\rho|a_l, b_k\rangle \tag{8} $$

for all $k \ne l$. Moreover, these conditions are equivalent to the existence of Hermitian
operators V and W, acting on the first and second components respectively, satisfying

$$ (V - \langle b_j|\rho|b_j\rangle)\,|a_j\rangle = 0, \qquad (W - \langle a_j|\rho|a_j\rangle)\,|b_j\rangle = 0 \tag{9} $$

for all j. The corresponding extremal value of coincidence rate is given by

$$ C(A,B|\rho) = \mathrm{tr}_1[V] = \mathrm{tr}_2[W]. \tag{10} $$

Proof. Consider the variational quantity 

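$$ J := \sum_j \langle a_j, b_j|\rho|a_j, b_j\rangle - \mathrm{tr}_1\Big[V\Big(\sum_j |a_j\rangle\langle a_j| - 1_1\Big)\Big] - \mathrm{tr}_2\Big[W\Big(\sum_j |b_j\rangle\langle b_j| - 1_2\Big)\Big], $$

defined for arbitrary sets of kets $\{|a_j\rangle\}$ and $\{|b_j\rangle\}$ of the same cardinality, where V and
W are Hermitian operators that act as Lagrange multipliers for enforcing the completeness
constraints

$$ \sum_j |a_j\rangle\langle a_j| = 1_1, \qquad \sum_j |b_j\rangle\langle b_j| = 1_2. \tag{11} $$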

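Clearly $C_{\max}(\rho)$ in Eq. (7) corresponds to the global maximum of J under these constraints.
Letting $J(\epsilon)$ denote J evaluated under the variations $|a_j\rangle \to |a_j\rangle + \epsilon|m_j\rangle$, $|b_j\rangle \to |b_j\rangle + \epsilon|n_j\rangle$,
the extremal points of J correspond to the solutions of $J'(0) = 0$, i.e.,

$$ \sum_j \mathrm{tr}_1\big[(|m_j\rangle\langle a_j| + \mathrm{h.c.})(\langle b_j|\rho|b_j\rangle - V)\big] + \sum_j \mathrm{tr}_2\big[(|n_j\rangle\langle b_j| + \mathrm{h.c.})(\langle a_j|\rho|a_j\rangle - W)\big] = 0. $$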

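Choosing at most one element of the $\{|m_j\rangle, |n_j\rangle\}$ to be non-vanishing (and arbitrary) then
yields Eq. (9). Multiplying the latter on the left by $\langle a_k|$ and $\langle b_k|$ further yields

$$ \langle a_k, b_j|\rho|a_j, b_j\rangle = \langle a_k|V|a_j\rangle, \qquad \langle a_j, b_k|\rho|a_j, b_j\rangle = \langle b_k|W|b_j\rangle, $$

and Eq. (8) immediately follows from the requirement that V and W are Hermitian.
Multiplying on the right of Eq. (9) by $\langle a_j|$ and $\langle b_j|$, and summing over j, yields

$$ V = \sum_j \langle b_j|\rho|b_j\rangle\, |a_j\rangle\langle a_j|, \qquad W = \sum_j \langle a_j|\rho|a_j\rangle\, |b_j\rangle\langle b_j|. \tag{12} $$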

Taking these as defining relations conversely yields Eq. (9) from Eq. (8). The trace of 
Eq. (12) yields Eq. (10). □ 

Proposition 1 has a formal connection to the well known problem of distinguishing
between members of a given statistical ensemble. In particular, let $\{\rho_j; \lambda_j\}$ denote the ensemble
containing state $\rho_j$ with probability $\lambda_j$. It is known that necessary and sufficient conditions
for a POM $\{\Pi_j\}$ to optimally discriminate between members of this ensemble are [3]

$$ (T - \lambda_j \rho_j)\,\Pi_j = 0, \qquad T \ge \lambda_j \rho_j \tag{13} $$

for all j, for some Hermitian operator T. The first of these conditions is equivalent to
Eq. (9) of Proposition 1, for the ensembles $\{\sigma_j; p_j\}$ and $\{\tau_j; q_j\}$ defined by $p_j \sigma_j := \langle b_j|\rho|b_j\rangle$
and $q_j \tau_j := \langle a_j|\rho|a_j\rangle$. Further, summing this first condition over j yields $T = \sum_j \lambda_j \rho_j \Pi_j$,
corresponding to Eq. (12).

However, there is no simple analogue of the second condition in Eq. (13). In particular,
while the conditions

$$ V \ge \langle b_j|\rho|b_j\rangle, \qquad W \ge \langle a_j|\rho|a_j\rangle \tag{14} $$

would immediately imply that A optimally discriminates between members of the ensemble
$\{\sigma_j; p_j\}$, and that B optimally discriminates between members of the ensemble $\{\tau_j; q_j\}$,
these conditions are not sufficient to ensure a maximum for the coincidence rate, as will be
shown by explicit example in Sec. III C. Indeed, it is not clear that these conditions are even
necessary.

Finally, some general properties of extremal observables are worth noting. First, for
pure states, the matrix coefficients in Eq. (8) vanish identically for $k \ne l$, for the case where
observables A and B correspond to the Schmidt basis decomposition in Eq. (1), and hence
these observables are extremal as expected. Second, Eq. (8) implies that if A and B are
extremal for two density operators $\rho$ and $\rho'$, then they are extremal for any mixture of $\rho$ and
$\rho'$. Third, if $\rho$ is invariant under some local unitary transformation, i.e., $\rho = U_1 \otimes U_2\, \rho\, U_1^\dagger \otimes U_2^\dagger$,
then, for a given solution A and B of Eq. (8), there will be a second solution $\tilde{A}$ and $\tilde{B}$, with
$|\tilde{a}_j\rangle = U_1 |a_j\rangle$ and $|\tilde{b}_j\rangle = U_2 |b_j\rangle$. A similar symmetry holds when $\rho$ is invariant under the
interchange of the two component systems.

B. Maxima and n- valued measurements 

The second-order variation of the quantity $J(\epsilon)$ appearing in the proof of Proposition
1 immediately yields the condition $J''(0) \le 0$ for two extremal observables A and B to
correspond to a local maximum of coincidence rate. This condition is required to hold only
for all kets $|m_j\rangle$ and $|n_j\rangle$ satisfying

$$ \sum_j \big(|m_j\rangle\langle a_j| + |a_j\rangle\langle m_j|\big) = 0 = \sum_j \big(|n_j\rangle\langle b_j| + |b_j\rangle\langle n_j|\big) \tag{15} $$

(corresponding to the completeness constraints in Eq. (11), to first order in $\epsilon$). However,
the set of such kets is not straightforward to characterise explicitly, and is dependent on the 
particular POMs A and B in question, making the condition difficult to verify in practice. In
contrast, an explicit and generic condition for $C(A,B|\rho)$ to be a local maximum is obtained
in Proposition 2 below, based on the Naimark extension theorem. The restricted problem 
of maximising coincidence rate over n-valued measurements is also discussed. 

Attention will be limited to the case where $\rho$ has finite support. In particular, if $H_1$
and $H_2$ are defined to be the Hilbert spaces spanned by the eigenstates of the reduced
density operators $\rho_1 := \mathrm{tr}_2[\rho]$, $\rho_2 := \mathrm{tr}_1[\rho]$, then it is assumed that these Hilbert spaces are
finite-dimensional, i.e.,

$$ d_1 := \dim(H_1) < \infty, \qquad d_2 := \dim(H_2) < \infty. \tag{16} $$

Now, consider a maximal POM $A = \{|a_j\rangle\langle a_j|\}$ on a d-dimensional Hilbert space H, having
at most n non-zero elements (hence, from Eq. (11), $n \ge d$). The Naimark
extension theorem then implies there is an n-dimensional Hilbert space $H_n$ containing H as
a subspace, and a maximal orthogonal POM $X = \{|x_j\rangle\langle x_j|\}$ on $H_n$ (i.e., with $\langle x_j|x_k\rangle = \delta_{jk}$),
such that $|a_j\rangle = E|x_j\rangle$, where E denotes the d-dimensional projection operator from $H_n$ to
H [4, 14]. The converse result trivially holds: any maximal orthogonal POM on $H_n$, with
'eigenstates' $\{|x_j\rangle\}$, generates a maximal POM A on H with at most n non-zero elements,
defined via $|a_j\rangle := E|x_j\rangle$. Since all d-dimensional subspaces of $H_n$ are unitarily equivalent,
this establishes the following Lemma:

Lemma (Naimark extension theorem for maximal POMs): For a d-dimensional Hilbert
space, H, the set of maximal POMs on H having at most $n \ge d$ non-zero elements is
characterised by the set of maximal orthogonal POMs on any n-dimensional Hilbert space
$H_n$ that contains H as a subspace.

It follows immediately, taking the limit $n \to \infty$, that the class of all maximal POMs
on H can be represented by the class of maximal orthogonal POMs on $H_\infty$. Thus, the
joint measurement of any two maximal POMs A and B, on the respective components of
the tensor product $H_1 \otimes H_2$ spanned by $\rho$, can be represented by the measurement of two
maximal orthogonal POMs X and Y on the respective components of the tensor product
$H_\infty \otimes H_\infty$, with

$$ |a_j\rangle = E|x_j\rangle, \qquad |b_k\rangle = F|y_k\rangle, \qquad (E \otimes F)\rho = \rho = \rho(E \otimes F), \tag{17} $$

where E and F denote the $d_1$- and $d_2$-dimensional projections onto $H_1$ and $H_2$ respectively.
In particular, one has

$$ C(A,B|\rho) = C(X,Y|\rho) := \sum_{j=1}^{\infty} \langle x_j, y_j|\rho|x_j, y_j\rangle. \tag{18} $$

The advantage of this representation is that maximal orthogonal POMs on $H_\infty$ are connected
by unitary transformations. This allows one to explicitly write down the necessary and 
sufficient conditions for an extremal value of coincidence rate to be a local maximum, as per 
the following proposition. 

Proposition 2: Two maximal orthogonal POMs X and Y on $H_\infty$, and hence the
corresponding maximal POMs A and B defined via Eq. (17), generate a local maximum of
coincidence rate if and only if

$$ \langle x_k, y_l|\rho|x_l, y_l\rangle = \langle x_k, y_k|\rho|x_l, y_k\rangle, \qquad \langle x_k, y_l|\rho|x_k, y_k\rangle = \langle x_l, y_l|\rho|x_l, y_k\rangle \tag{19} $$

for all $k \ne l$, and

$$ \sum_j \Big\{ \langle x_j|M(V - \langle b_j|\rho|b_j\rangle)M|x_j\rangle + \langle y_j|N(W - \langle a_j|\rho|a_j\rangle)N|y_j\rangle + \mathrm{tr}\big(\rho\, [M, |x_j\rangle\langle x_j|] \otimes [N, |y_j\rangle\langle y_j|]\big) \Big\} \ge 0 \tag{20} $$

for all Hermitian operators M and N on $H_\infty$, where V and W are defined as per Eq. (12).

The proof is given in the Appendix. Note that the first condition is equivalent to Eq. (8) 
of Proposition 1 (and hence to Eq. (9) also), as an immediate consequence of Eq. (17). 
Further, the second condition is equivalent to the condition $J''(0) \le 0$ discussed above, if
one defines $|m_j\rangle := iEM|x_j\rangle$ and $|n_j\rangle := iFN|y_j\rangle$ (the constraints in Eq. (15) follow from
the anti-Hermiticity of the operators iEME and iFNF). Note that the presence of the last
term in Eq. (20) implies that the conditions in Eq. (14) are not sufficient to ensure a local
maximum. Examples will be given in Sec. III C below.

Proposition 2 applies to observables having an arbitrary number of possible outcomes. 
However, it is also of interest to consider the case where A and B are restricted to have 
a maximum of n possible outcomes, i.e., where the corresponding POMs have at most n 
non-zero elements. The completeness constraints in Eq. (11) imply that $n \ge d_1, d_2$. The
maximum of the coincidence rate over such observables, for a given density operator $\rho$, will
be denoted by $C^{(n)}_{\max}(\rho)$. Noting the above Lemma, one has

$$ C^{(n)}_{\max}(\rho) = \max_{X_n, Y_n} C(X_n, Y_n|\rho) = \max_{X_n, Y_n} \sum_{j=1}^{n} \langle x_j, y_j|\rho|x_j, y_j\rangle, \tag{21} $$
where the maximum is over all maximal orthogonal POMs $X_n$ and $Y_n$ on $H_n$. Clearly,
$C^{(n)}_{\max}(\rho)$ is a non-decreasing function of n, and converges to $C_{\max}(\rho)$, i.e.,
defining $d := \max\{d_1, d_2\}$,

$$ C^{(d)}_{\max}(\rho) \le C^{(n)}_{\max}(\rho) \le C^{(\infty)}_{\max}(\rho) = C_{\max}(\rho). \tag{22} $$

An explicit expression for $C^{(2)}_{\max}(\rho)$ is given in Sec. IV, and general upper bounds for $C^{(n)}_{\max}(\rho)$
are obtained in Sec. V.

Now, a maximal POM A with n elements may trivially be extended to an infinite number 
of elements by defining $|a_j\rangle := 0$ for $j > n$. Hence, such n-valued POMs may be thought of
as lying on the 'boundary' of the set of all maximal POMs. It would be of interest to show
that $C^{(n)}_{\max}(\rho)$, corresponding to the maximum of coincidence rate over a restricted portion
of this boundary, is also (at the least) a local maximum of coincidence rate with respect 
to the full set of maximal POMs. The following corollary to Proposition 2 shows that the 
conditions in Eq. (14) are sufficient for this to be the case. 

Corollary: If the Hermitian operators V and W defined in Eq. (12) satisfy $V \ge \langle b_j|\rho|b_j\rangle$
and $W \ge \langle a_j|\rho|a_j\rangle$ for all j, for maximal POMs $A^{(n)}$ and $B^{(n)}$ achieving $C^{(n)}_{\max}(\rho)$, then
$C^{(n)}_{\max}(\rho)$ is a local maximum of coincidence rate with respect to the set of all maximal POMs.

Proof. By the above Lemma, maximal POMs with at most n non-zero elements can be
represented by the set of maximal orthogonal POMs on $H_n$. Further, since the group of
unitary transformations U(n) × U(n) is compact, the global maximum of coincidence rate
over such orthogonal POMs must actually be achievable, by two orthogonal POMs $X_n$
and $Y_n$ on $H_n$, having eigenstates $|x_1\rangle, \dots, |x_n\rangle$ and $|y_1\rangle, \dots, |y_n\rangle$ respectively. It may be
shown, just as per the proof of Proposition 2, that these eigenstates must satisfy Eqs. (19)
and (20) with the ranges of j, k, l restricted to 1, 2, ..., n (and with M and N restricted to $H_n$).
Further, any extension of $X_n$ and $Y_n$ to orthogonal POMs X and Y on $H_\infty$ must satisfy
$E|x_j\rangle = 0 = F|y_j\rangle$ for all $j > n$. It follows for such X and Y that (i) Eq. (19) is trivially
satisfied (implying the corresponding POMs A and B are extremal); (ii) the first and second
terms of Eq. (20) are the same as for $X_n$ and $Y_n$ when $j \le n$, and nonnegative when $j > n$
(as a consequence of the premise of the Corollary); and (iii) the third term in Eq. (20) is the
same as for $X_n$ and $Y_n$ when $j \le n$, and vanishes when $j > n$. Hence, from Proposition 2,
$C^{(n)}_{\max}(\rho)$ is a local maximum of coincidence rate with respect to POMs having an arbitrary
number of elements. □

Note from the above proof that the maximal POMs $A^{(n)}$ and $B^{(n)}$ achieving $C^{(n)}_{\max}(\rho)$ must
satisfy Eq. (8), with k and l restricted to the range 1, 2, ..., n. It may be checked (noting that
$\rho$ is Hermitian) that this places $2n(n-1)$ real constraints on the elements of $A^{(n)}$ and $B^{(n)}$,
which are invariant under the n! permutations of the elements that preserve the condition
$k \ne l$. On the other hand, to specify two arbitrary maximal POMs, each having no more
than n non-zero elements, requires $2n(n-1)$ real parameters (corresponding to specifying
the unitary transformations $|x_j\rangle = U_X|z_j\rangle$, $|y_j\rangle = U_Y|z_j\rangle$ on $H_n$ relative to some fixed
orthonormal basis $\{|z_j\rangle\}$, up to arbitrary phases), with $(n!)^2$ possible orderings of the elements
(i.e., n! orderings for each POM). It is therefore expected, for a generic density operator $\rho$,
that there are n! pairs of extremal candidates for $A^{(n)}$ and $B^{(n)}$ (for density operators having
particular symmetries, there will be further extrema, as per the last paragraph of Sec. III A).
However, it is conjectured in the next subsection that $C^{(n)}_{\max}(\rho)$ is in fact independent of n,
which is equivalent to equality throughout in Eq. (22). If true, this means that no more
than d! candidates for the optimal observables need be checked in the generic case.

C. Two examples and one conjecture 

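As a first example, we will consider the case of 'trine' measurements on a two-qubit
system, for which the measurement on each qubit optimally distinguishes between the states of
the ensemble prepared by the measurement on the other qubit, and vice versa. Surprisingly,
these measurements do not generate a global (or even a local) maximum of coincidence rate.

In particular, let $\{|1\rangle, |2\rangle\}$ be a basis set for either qubit, and consider the 3-valued 'trine'
observables $A = B = \{\tfrac{2}{3}|\phi_j\rangle\langle\phi_j|\}$, where the normalised kets

$$ |\phi_1\rangle := |1\rangle, \qquad |\phi_2\rangle := \tfrac{1}{2}\big(|1\rangle + \sqrt{3}\,|2\rangle\big), \qquad |\phi_3\rangle := \tfrac{1}{2}\big(|1\rangle - \sqrt{3}\,|2\rangle\big) $$

form the vertices of an equilateral triangle in the Bloch representation. For the pure bipartite
state $\rho = |\psi\rangle\langle\psi|$, with

$$ |\psi\rangle := \tfrac{1}{\sqrt{2}}\big(|1\rangle \otimes |1\rangle + |2\rangle \otimes |2\rangle\big), $$

it is then easily checked that

$$ \langle a_j|\rho|a_j\rangle = \tfrac{1}{3}|\phi_j\rangle\langle\phi_j| = \langle b_j|\rho|b_j\rangle $$

on the respective components. It follows that the operators V and W defined in Eq. (12)
are each equal to $\tfrac{1}{3} 1$, implying from Proposition 1 that A and B generate an extremal value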
are each equal to |l, implying from Proposition 1 that A and B generate an extremal value 
of coincidence rate, given by

$$ C(A,B|\rho) = \mathrm{tr}[V] = 2/3. $$

Further, the conditions in Eqs. (14) are trivially satisfied for this example, implying
via Eq. (13) that A optimally distinguishes between members of the ensemble of states
$\{|\phi_j\rangle\langle\phi_j|; \tfrac{1}{3}\}$ prepared by measurement of B, and vice versa (see also Sec. IV. 1(a) of Ref. [3]).

However, A and B above do not generate a global maximum of coincidence rate for state
$\rho$, as the maximum possible value of unity may be achieved by instead choosing POMs with
elements diagonal with respect to any Schmidt decomposition of $|\psi\rangle$. Indeed, A and B
above do not even generate a local maximum of coincidence rate: the extremal value of 2/3
in fact corresponds to a saddle point. To see this, note first that the optimal distinguishing
property implies that varying either A or B (while keeping the other fixed) must decrease the
coincidence rate. Hence, the extremal value of 2/3 represents a maximum with respect to
such variations. On the other hand, consider the one-parameter 'mirror-symmetric' family
of observables $A^{(\alpha)} = B^{(\alpha)} = \{f_j(\alpha)|\phi_j^{(\alpha)}\rangle\langle\phi_j^{(\alpha)}|\}$, with $0 \le \alpha \le 1$, $f_1(\alpha) = 1 - \alpha$,
$f_{2,3}(\alpha) = (1 + \alpha)/2$, and [15]

$$ |\phi_1^{(\alpha)}\rangle := |1\rangle, \qquad |\phi_{2,3}^{(\alpha)}\rangle := (1+\alpha)^{-1/2}\big(\sqrt{\alpha}\,|1\rangle \pm |2\rangle\big). $$

Choosing $\alpha = 1/3$ corresponds to the trine observables. It is straightforward to calculate

$$ C(A^{(\alpha)}, B^{(\alpha)}|\rho) = \frac{2}{3} + \frac{3}{4}\Big(\alpha - \frac{1}{3}\Big)^2, $$

and hence the extremal value of 2/3 represents a minimum of coincidence rate with respect
to the variation of $\alpha$ [16].
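
The behaviour of the coincidence rate along the mirror-symmetric family can be checked numerically. The following minimal sketch assumes the family is parametrised by weights 1 - α and (1 + α)/2 (twice) on the normalised real kets |1⟩ and (1 + α)^{-1/2}(√α|1⟩ ± |2⟩), evaluated on the state (|1⟩⊗|1⟩ + |2⟩⊗|2⟩)/√2; the function name is hypothetical.

```python
import math

def family_coincidence_rate(alpha):
    """Coincidence rate of the mirror-symmetric POMs A = B on the state
    (|11> + |22>)/sqrt(2), with all kets real in the basis {|1>, |2>}."""
    norm = 1.0 / math.sqrt(1.0 + alpha)
    kets = [
        (1.0, 0.0),
        (norm * math.sqrt(alpha), norm),
        (norm * math.sqrt(alpha), -norm),
    ]
    weights = [1.0 - alpha, (1.0 + alpha) / 2.0, (1.0 + alpha) / 2.0]
    rate = 0.0
    for weight, (c1, c2) in zip(weights, kets):
        # Amplitude <phi, phi | psi> for a real ket phi = c1|1> + c2|2>.
        amplitude = (c1 * c1 + c2 * c2) / math.sqrt(2.0)
        # Joint probability of the matched outcome (j, j).
        rate += (weight * amplitude) ** 2
    return rate

for alpha in (0.0, 1.0 / 3.0, 0.5, 1.0):
    print(alpha, family_coincidence_rate(alpha))
```

The trine value α = 1/3 gives 2/3, and every other value of α in the range gives a larger coincidence rate, consistent with the saddle-point interpretation.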

As an example of Proposition 2, consider now a separable state of the form

$$ \rho = \sum_{j=1}^{d} \lambda_j\, |\psi_j\rangle\langle\psi_j| \otimes |\chi_j\rangle\langle\chi_j|, \tag{23} $$

where the mutual orthogonality property $\langle\psi_j|\psi_k\rangle = \delta_{jk}$ is satisfied, and each $|\chi_j\rangle$ is arbitrary.
Let A be the maximal orthogonal POM defined by $|a_j\rangle := |\psi_j\rangle$, i.e., the optimal POM for
distinguishing members of the ensemble $\{|\psi_j\rangle\langle\psi_j|; \lambda_j\}$; and let B be the maximal POM
which optimally distinguishes between members of the pure-state ensemble $\{|\chi_j\rangle\langle\chi_j|; \lambda_j\}$
(the existence of such a maximal POM B follows from Theorem 2 of Ref. [17]). Thus, from 
Eqs. (12) and (13), 

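$$ (V - \lambda_j |\psi_j\rangle\langle\psi_j|)\,|a_j\rangle = 0 = (W - \lambda_j |\chi_j\rangle\langle\chi_j|)\,|b_j\rangle, \qquad V \ge \lambda_j |\psi_j\rangle\langle\psi_j|, \qquad W \ge \lambda_j |\chi_j\rangle\langle\chi_j|. $$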
It is then straightforward to check that both conditions of Proposition 2 are satisfied by any
X and Y corresponding to A and B respectively (in particular, the third term in Eq. (20)
vanishes identically, since the orthogonality of the elements of A implies that $\rho$ and $|x_j\rangle\langle x_j|$
must commute). Hence this choice of A and B generates a local maximum of coincidence
rate. Indeed, since a measurement outcome $A = a_j$ for the first component is perfectly
correlated with preparation of state $|\chi_j\rangle$ for the second component, and since B is the best
possible measurement for distinguishing between such prepared states, the above choice of
A and B is intuitively expected to generate a global maximum of coincidence rate.

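Note that if d = 2 in Eq. (23), then B is the orthogonal POM generated by the eigenstates
of [3]

$$ \Gamma := \lambda_1 |\chi_1\rangle\langle\chi_1| - \lambda_2 |\chi_2\rangle\langle\chi_2|. $$

The corresponding maximum value of coincidence rate follows as (cf. Eq. (2.34) in Chap. IV
of Ref. [3])

$$ C(A,B|\rho) = \frac{1}{2}\big(1 + \mathrm{tr}[|\Gamma|]\big) = \frac{1}{2}\Big[1 + \big(1 - 4\lambda_1\lambda_2 |\langle\chi_1|\chi_2\rangle|^2\big)^{1/2}\Big]. \tag{24} $$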

This result is significantly generalised in Sec. IV. 
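
As a numerical cross-check of Eq. (24), the sketch below compares the trace-norm expression with the closed form for a pair of real qubit states. It assumes, purely for illustration, that the two states on the second component are (1, 0) and (cos θ, sin θ) in some fixed basis, with prior weights λ and 1 - λ; the function names are hypothetical.

```python
import math

def trace_norm_sym2(g):
    """Trace norm (sum of absolute eigenvalues) of a real symmetric 2x2 matrix."""
    half_trace = 0.5 * (g[0][0] + g[1][1])
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    root = math.sqrt(max(half_trace ** 2 - det, 0.0))
    return abs(half_trace + root) + abs(half_trace - root)

def coincidence_from_trace_norm(lam1, angle):
    """(1 + tr|Gamma|)/2 for the states (1, 0) and (cos angle, sin angle)."""
    lam2 = 1.0 - lam1
    c, s = math.cos(angle), math.sin(angle)
    # Gamma = lam1 |chi_1><chi_1| - lam2 |chi_2><chi_2| in the fixed basis.
    gamma = [[lam1 - lam2 * c * c, -lam2 * c * s],
             [-lam2 * c * s, -lam2 * s * s]]
    return 0.5 * (1.0 + trace_norm_sym2(gamma))

def coincidence_closed_form(lam1, angle):
    """Right-hand side of Eq. (24), with overlap |<chi_1|chi_2>|^2 = cos^2(angle)."""
    lam2 = 1.0 - lam1
    overlap_sq = math.cos(angle) ** 2
    return 0.5 * (1.0 + math.sqrt(max(1.0 - 4.0 * lam1 * lam2 * overlap_sq, 0.0)))

for lam1, angle in ((0.5, 0.4), (0.7, 1.0), (0.2, math.pi / 2)):
    print(lam1, angle, coincidence_from_trace_norm(lam1, angle), coincidence_closed_form(lam1, angle))
```

Orthogonal states give a coincidence rate of unity, while identical states give the larger of the two prior weights, as expected for a discrimination problem.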

Finally, note in the above example that A is an orthogonal POM, having the minimum
possible number, n = d, of non-zero elements. We conjecture that this may be an
instance of a general rule. As motivation, observe that if Alice and Bob each measure
observables having $n \ge d$ possible outcomes, then the outcomes will typically have a greater
degree of randomness when n > d. For example, the entropy H(A) of a maximal POM A is
bounded below by

$$ H(A) = -\sum_j p_j \log_2 p_j \ge -\log_2 \max_j p_j \ge -\log_2 \max_j \langle a_j|a_j\rangle, $$

which is nontrivial if A is non-orthogonal (i.e., if n > d). Similarly, the joint entropy of A
and B is bounded below by

$$ H(AB) \ge -\log_2 \max_{j,k} \langle a_j|a_j\rangle \langle b_k|b_k\rangle. $$

Further, the more random a distribution is, the more spread out it is over the set of possible
outcomes. Hence, the sum over the diagonal elements of the joint distribution $p_{jk}$ (i.e.,
the coincidence rate), will typically be smaller. It follows that choosing n > d is typically
expected to have a decreasing effect on the maximum achievable coincidence rate: 

Conjecture: The global maximum of coincidence rate, for a bipartite density operator
with finite support, can always be achieved by observables A and B having at most
$d = \max\{d_1, d_2\}$ possible outcomes, where $d_1$ and $d_2$ are the Hilbert space dimensions defined in
Eq. (16).

Note that the conjecture implies at least one of A and B corresponds to an orthogonal
POM, depending on whether $d = d_1$ and/or $d = d_2$. Note further that the conjecture
corresponds to the case of equality throughout in Eq. (22), i.e., to the condition

$$ C_{\max}(\rho) = C^{(d)}_{\max}(\rho). \tag{25} $$

This conjecture is consistent with the convexity properties discussed in Sec. II and, if true, 
would greatly simplify the numerical determination of the maximum coincidence rate, as only 
POMs with d elements would need to be considered. Partial numerical support has been 
found for the conjecture, for the case of two-qubit systems. In particular, the evaluation 
of coincidence rate for $\approx 10^{11}$ pairs of maximal POMs having no more than 3 non-zero
elements, for each member of a random sample of 1200 bipartite density operators, indicates
that $C^{(3)}_{\max} = C^{(2)}_{\max}$.

IV. MAXIMUM SPIN CORRELATION FOR TWO QUBITS 

An exact result for two-qubit systems is derived here, which also introduces the basic 
method used in the following section to derive general upper bounds for the coincidence 
rate. 

A system of two qubits is described by a density operator $\rho$ on $H_2 \otimes H_2$, so that $d_1 =
d_2 = d = 2$. Consider the problem of finding the maximal two-valued POMs A and B which
maximise the coincidence rate. Such POMs are necessarily orthogonal, corresponding to
the measurement of spin in some direction, and hence, noting Eq. (21), the corresponding
coincidence rate can be written as

$$ C^{(2)}_{\max}(\rho) = \max_{a,b} C(\sigma^{(1)}\cdot a,\, \sigma^{(2)}\cdot b \,|\, \rho), $$ (26)

where a and b are unit directions. Note that $C^{(2)}_{\max}(\rho)$ is in fact equal to the global maximum
of coincidence rate, $C_{\max}(\rho)$, if the conjecture in Eq. (25) is correct.

To determine $C^{(2)}_{\max}(\rho)$, let $|m\rangle$ denote the +1 eigenstate of $\sigma\cdot m$ for unit direction m, so
that $|m\rangle\langle m| = (1 + \sigma\cdot m)/2$. Hence, $A = \{|a\rangle\langle a|, |{-a}\rangle\langle{-a}|\}$, $B = \{|b\rangle\langle b|, |{-b}\rangle\langle{-b}|\}$, and
the coincidence rate follows via Eq. (6) as

$$ C(A, B|\rho) = \tfrac{1}{2}\,\mathrm{tr}[\rho\,(1 + \sigma^{(1)}\cdot a \otimes \sigma^{(2)}\cdot b)] = \tfrac{1}{2}(1 + a^T S b), $$

where S is the 3x3 'spin correlation' matrix defined by

$$ S_{jk} := \langle \sigma^{(1)}_j \otimes \sigma^{(2)}_k \rangle. $$ (27)

Note that S is real, but in general is not symmetric. 

Now, the singular value decomposition theorem [18] states that any real p x q matrix S
can be put in the form

$$ S = R_1 D R_2, $$ (28)

where $R_1$ and $R_2$ are real orthogonal matrices (of dimensions p x p and q x q respectively),
and D is a real p x q matrix of the form

$$ D_{jk} = s_j \delta_{jk}, \qquad s_1 \geq s_2 \geq \cdots \geq 0. $$

The numbers $s_j$ are called the singular values of S, and are just the square roots of the
eigenvalues of each of $S^T S$ and $S S^T$, while the columns of $R_1$ and the rows of $R_2$ are
formed by the respective eigenvectors of $S S^T$ and $S^T S$ [18]. The largest singular value,
$s_1$, is also known as the spectral norm of S.

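It follows in particular, defining $u = R_1^T a$ and $v = R_2 b$, and using the Schwarz inequality,
that for unit vectors a and b one has

$$ \max_{a,b} a^T S b = \max_{u,v} |u^T D v| = \max_{u,v} \Big| \sum_j s_j u_j v_j \Big| $$
$$ \leq \max_{u,v} \Big[ \sum_j s_j (u_j)^2 \Big]^{1/2} \Big[ \sum_k s_k (v_k)^2 \Big]^{1/2} $$
$$ = \max_u \sum_j s_j (u_j)^2 \leq \Big( \max_j s_j \Big) \sum_j (u_j)^2 = s_1, $$

with equality obtained for the choice $u = v = x := (1,0,0)$. Thus,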

$$ C^{(2)}_{\max}(\rho) = \tfrac{1}{2}(1 + s_1), $$ (29)

where $s_1$ is the spectral norm of the spin-correlation matrix S defined in Eq. (27) (hence one
must have $s_1 = 1$ for all pure states), with this maximum coincidence rate being achieved
via spin measurements in the directions

$$ a = R_1 x, \qquad b = R_2^T x. $$

The case of spin measurements on two qubits is thus completely solved.
As a simple example, consider the separable state

$$ \rho = \lambda_1 |z\rangle\langle z| \otimes \tau_1 + \lambda_2 |{-z}\rangle\langle{-z}| \otimes \tau_2 $$

for arbitrary qubit density operators $\tau_1$ and $\tau_2$. One finds that all elements of the
spin-correlation matrix vanish other than the third row, which is given by the 3-vector r with
components

$$ r_k := \lambda_1 \mathrm{tr}[\tau_1 \sigma_k] - \lambda_2 \mathrm{tr}[\tau_2 \sigma_k]. $$

It follows that only the 33-component of $S S^T$ is non-zero, and equal to $r \cdot r$, yielding

$$ C^{(2)}_{\max}(\rho) = (1 + |r|)/2. $$

This result generalises Eq. (24) of the previous section, and greatly simplifies calculation
of the corresponding coincidence rate, as it does not require explicit diagonalisation of the
operator $\eta$. Note that the coincidence rate is equal to the average probability for optimally
discriminating between members of the ensemble $\{\tau_j; \lambda_j\}$ [3].
As a second example, consider the isotropic state [19]

$$ \rho_w = w\, |\Psi^-\rangle\langle\Psi^-| + \frac{1-w}{3}\, 1_T, $$

where $|\Psi^-\rangle$ denotes the singlet state, $1_T = 1 - |\Psi^-\rangle\langle\Psi^-|$ denotes the unit operator on the
triplet subspace, and $0 \leq w \leq 1$. This state is rotationally-invariant, and the spin-correlation
matrix is easily calculated to be $S_{jk} = -(4w - 1)\delta_{jk}/3$. It follows immediately from Eq. (29) that

$$ C^{(2)}_{\max}(\rho_w) = \tfrac{1}{2}\,(1 + |4w - 1|/3), $$

with the maximum coincidence rate being achieved by the choice $a = b$ for $0 \leq w < 1/4$,
and $a = -b$ for $1/4 < w \leq 1$.
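The recipe of Eqs. (27)-(29) is easy to check numerically. The following is a minimal NumPy sketch rather than anything taken from the original analysis: the function names are illustrative, and it assumes the convention $S = R_1 D R_2$ of Eq. (28), which is also the convention returned by `numpy.linalg.svd`, so that the optimal directions are read off as the first column of $R_1$ and the first row of $R_2$. The isotropic state above is used as the test case.

```python
import numpy as np

# Pauli matrices sigma_1, sigma_2, sigma_3.
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def spin_correlation_matrix(rho):
    """S_jk = tr[rho (sigma_j (x) sigma_k)], as in Eq. (27)."""
    return np.array([[np.trace(rho @ np.kron(sj, sk)).real for sk in PAULI]
                     for sj in PAULI])

def max_spin_coincidence(rho):
    """Return (maximum coincidence rate, direction a, direction b), as in Eq. (29)."""
    S = spin_correlation_matrix(rho)
    R1, s, R2 = np.linalg.svd(S)      # S = R1 @ diag(s) @ R2, s sorted descending
    a, b = R1[:, 0], R2[0, :]         # first column of R1, first row of R2
    return 0.5 * (1.0 + s[0]), a, b

def coincidence(rho, a, b):
    """Direct evaluation of (1/2) tr[rho (1 + sigma.a (x) sigma.b)]."""
    sa = sum(c * p for c, p in zip(a, PAULI))
    sb = sum(c * p for c, p in zip(b, PAULI))
    return 0.5 * (1.0 + np.trace(rho @ np.kron(sa, sb)).real)

def isotropic_state(w):
    """rho_w = w |singlet><singlet| + (1 - w)/3 * (projector onto the triplet subspace)."""
    psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
    singlet = np.outer(psi, psi.conj())
    return w * singlet + (1 - w) / 3 * (np.eye(4) - singlet)

rho = isotropic_state(0.7)
c_max, a, b = max_spin_coincidence(rho)
print(c_max, coincidence(rho, a, b))  # the two numbers should agree
```

Evaluating `coincidence` at the returned directions gives an independent check that the singular-value bound is actually attained.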

Finally, for a general factorisable state $\rho = \rho_1 \otimes \rho_2$, with $\rho_1 = (1 + m\cdot\sigma)/2$ and $\rho_2 =
(1 + n\cdot\sigma)/2$, the spin correlation matrix is just the outer product $S = m n^T$, so that
$S S^T = (n\cdot n)\, m m^T$, with eigenvalues $(m\cdot m)(n\cdot n)$, 0, and 0. It follows immediately from
Eq. (29) that the maximum possible coincidence rate for two uncorrelated qubits is given by

$$ C^{(2)}_{\max}(\rho_1 \otimes \rho_2) = \tfrac{1}{2}(1 + |m|\,|n|), $$

achieved by choosing the measurement directions a and b parallel to m and n respectively.

V. BOUNDS FOR COINCIDENCE RATE



A. A general upper bound 

Here an upper bound is given for $C^{(n)}_{\max}(\rho)$ in Eq. (21), i.e., for the maximum achievable
coincidence rate when Alice and Bob are restricted to measurements of n-valued observables.
This bound is tight for the case n = 2, reducing to Eq. (29) above. Conversely, taking the
limit $n \to \infty$ gives a global upper bound for $C_{\max}(\rho)$, which turns out to be equal to the
computable cross norm of $\rho$ [7]. Note that if the conjecture in Eq. (25) is correct, then
taking n = d will give a much tighter bound in general for $C_{\max}(\rho)$.

First, it is well known that the traceless Hermitian operators on an n-dimensional Hilbert
space $H_n$ form a real vector space of dimension $n^2 - 1$, with inner product $(M, N) := \mathrm{tr}[MN]$
[20]. Hence, if $\{K_p\}$ and $\{L_q\}$ denote two orthonormal basis sets for this vector space, then

$$ \mathrm{tr}[K_p K_q] = \delta_{pq} = \mathrm{tr}[L_p L_q], \qquad K_p = \sum_q R_{pq} L_q, $$ (30)

for some orthogonal matrix R (i.e., $R R^T = I$). It follows that the trace-free part of any
operator Z on $H_n$ can be written as

$$ Z - \frac{\mathrm{tr}[Z]}{n}\, 1 = \sum_p \mathrm{tr}[Z K_p]\, K_p = \sum_q \mathrm{tr}[Z L_q]\, L_q, $$ (31)

and that any bipartite density operator $\rho$ on $H_n \otimes H_n$ can be expressed as

$$ \rho = \frac{1}{n^2}\, 1 \otimes 1 + \sum_p u_p\, K_p \otimes 1 + \sum_q v_q\, 1 \otimes L_q + \sum_{p,q} T_{pq}\, K_p \otimes L_q, $$ (32)

where

$$ u_p := \langle K_p \otimes 1 \rangle / n, \qquad v_q := \langle 1 \otimes L_q \rangle / n, \qquad T_{pq} := \langle K_p \otimes L_q \rangle. $$ (33)

This is referred to as a Fano form for $\rho$ [5, 6].

Now, using Eqs. (30)-(33), the coincidence rate for two maximal orthogonal POMs $X_n =
\{|x_j\rangle\langle x_j|\}$ and $Y_n = \{|y_j\rangle\langle y_j|\}$ on $H_n$ simplifies to

$$ C(X_n, Y_n|\rho) = 1/n + \mathrm{Tr}[T R W], $$

where Tr denotes the matrix trace, and

$$ W_{pq} := \sum_j \langle x_j|L_p|x_j\rangle \langle y_j|L_q|y_j\rangle. $$ (34)

Further, Eq. (7.4.14) of Ref. [18] implies, for any two real matrices T and W and orthogonal
matrix R, that

$$ |\mathrm{Tr}[T R W]| \leq \sum_k s_k(T)\, s_k(W), $$

where $s_1(P) \geq s_2(P) \geq \ldots$ denote the singular values of matrix P (see Sec. IV). Hence,
noting Eq. (21), one has

$$ C^{(n)}_{\max}(\rho) \leq 1/n + \sum_k s_k(T)\, s_k(W). $$ (35)

To simplify this upper bound, note first that W can be written, in terms of the vectors

$$ f^{(j)}_p := \langle x_j|L_p|x_j\rangle, \qquad g^{(j)}_p := \langle y_j|L_p|y_j\rangle, $$

as the sum of outer products $W = \sum_j f^{(j)} (g^{(j)})^T$. Using Eq. (31) one finds

$$ f^{(j)} \cdot f^{(k)} = \delta_{jk} - 1/n = g^{(j)} \cdot g^{(k)}, $$

implying that

$$ W^T W = \sum_j g^{(j)} (g^{(j)})^T = (W^T W)^2, \qquad \mathrm{Tr}[W^T W] = n - 1. $$

Thus, $W^T W$ is an (n - 1)-dimensional projection matrix, implying that the non-zero singular
values of W consist of precisely n - 1 ones. The above upper bound therefore reduces to

$$ C^{(n)}_{\max}(\rho) \leq 1/n + \sum_{k=1}^{n-1} s_k(T), $$ (36)

where the matrix T is defined in Eq. (33).
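The projection property of $W^T W$ used above can be confirmed numerically. The sketch below is illustrative only: it assumes a generalised Gell-Mann basis (normalised so that $\mathrm{tr}[L_p L_q] = \delta_{pq}$) as one possible choice of $\{L_p\}$, draws the two orthonormal bases at random, and uses helper names that do not appear in the text.

```python
import numpy as np

def traceless_basis(n):
    """Orthonormal traceless Hermitian basis {L_p} on C^n, with tr[L_p L_q] = delta_pq."""
    basis = []
    for j in range(n):
        for k in range(j + 1, n):
            sym = np.zeros((n, n), dtype=complex)
            sym[j, k] = sym[k, j] = 1 / np.sqrt(2)
            asym = np.zeros((n, n), dtype=complex)
            asym[j, k], asym[k, j] = -1j / np.sqrt(2), 1j / np.sqrt(2)
            basis += [sym, asym]
    for l in range(1, n):
        diag = np.zeros((n, n), dtype=complex)
        diag[:l, :l] = np.eye(l)
        diag[l, l] = -l
        basis.append(diag / np.sqrt(l * (l + 1)))
    return basis

def random_orthonormal_basis(n, rng):
    """Columns of the returned matrix form a random orthonormal basis of C^n."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def diagonal_vectors(vecs, basis):
    """Row j holds f^(j)_p = <v_j| L_p |v_j> for the j-th column v_j of vecs."""
    return np.array([[np.vdot(v, L @ v).real for L in basis] for v in vecs.T])

rng = np.random.default_rng(0)
n = 4
basis = traceless_basis(n)
f = diagonal_vectors(random_orthonormal_basis(n, rng), basis)
g = diagonal_vectors(random_orthonormal_basis(n, rng), basis)
W = f.T @ g                      # W_pq = sum_j f^(j)_p g^(j)_q, Eq. (34)
P = W.T @ W                      # should be a projection of rank n - 1
singular_values = np.linalg.svd(W, compute_uv=False)
print(np.round(singular_values, 6))
```

Up to rounding, the printed singular values should consist of n - 1 ones followed by zeros.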

The bound can be further simplified, via a judicious choice of the basis sets $\{K_p\}$ and
$\{L_q\}$. In particular, recall that $\rho$ only has support on the subspace $H_1 \otimes H_2$ of $H_n \otimes H_n$ (see
Sec. III B). The first $(d_1)^2 - 1$ elements of $\{K_1, K_2, \ldots\}$ can therefore be chosen to form a
basis set for the traceless operators on $H_1$, and the first $(d_2)^2 - 1$ elements of $\{L_1, L_2, \ldots\}$
can similarly be chosen to form a basis set for the traceless operators on $H_2$. Two further
basis elements, relabelled as $K_0$ and $L_0$ for convenience, will be chosen to have the forms

$$ K_0 := \alpha_1 E - \beta_1 (1 - E), \qquad L_0 := \alpha_2 F - \beta_2 (1 - F), $$

where E and F denote the projections from $H_n$ to $H_1$ and $H_2$. The requirements $\mathrm{tr}[K_0] =
\mathrm{tr}[L_0] = 0$ and $\mathrm{tr}[(K_0)^2] = \mathrm{tr}[(L_0)^2] = 1$ imply that

$$ \alpha_1 = (1/d_1 - 1/n)^{1/2}, \qquad \beta_1 = \alpha_1 d_1/(n - d_1), $$
$$ \alpha_2 = (1/d_2 - 1/n)^{1/2}, \qquad \beta_2 = \alpha_2 d_2/(n - d_2). $$

Since the remaining basis elements must be orthogonal to the above basis elements, they
cannot contribute to the Fano form of $\rho$ in Eq. (32). Hence, using Eq. (33), the only nonzero
rows and columns of the matrix T are given by the $(d_1)^2 \times (d_2)^2$ submatrix

$$ T^{(n)} := \begin{pmatrix} \langle K_0 \otimes L_0 \rangle & \langle K_0 \otimes L_q \rangle \\ \langle K_p \otimes L_0 \rangle & \langle K_p \otimes L_q \rangle \end{pmatrix} = \begin{pmatrix} \alpha_1 \alpha_2 & \alpha_1 \langle 1_1 \otimes L_q \rangle \\ \alpha_2 \langle K_p \otimes 1_2 \rangle & \langle K_p \otimes L_q \rangle \end{pmatrix}, $$ (37)

where $1_1$ and $1_2$ denote the identity operators on $H_1$ and $H_2$ respectively, and $1 \leq p \leq
(d_1)^2 - 1$, $1 \leq q \leq (d_2)^2 - 1$. Substitution into Eq. (36) yields the main result of this section:

Theorem: The maximum coincidence rate obtainable for a bipartite state with finite
support, via maximal POMs having no more than n nonzero elements, is bounded above by

$$ C^{(n)}_{\max}(\rho) \leq 1/n + \sum_{k=1}^{\min\{n-1,\, \delta^2\}} s_k(T^{(n)}), $$ (38)

where $\delta := \min\{d_1, d_2\}$, and the matrix $T^{(n)}$ is defined in Eq. (37).

Since local unitary transformations correspond to left and right multiplication of $T^{(n)}$ by
orthogonal matrices, which leave the singular values unchanged [18], this upper bound is
invariant under such transformations.

For the case of two qubits, with $n = d_1 = d_2 = 2$, one may choose $K_p = \sigma_p/\sqrt{2}$ and
$L_q = \sigma_q/\sqrt{2}$. The 'zeroth' row and column of $T^{(2)}$ vanish for this case, since $\alpha_1 = \alpha_2 = 0$,
leaving a 3 x 3 submatrix equal to one-half of the spin-correlation matrix S in Eq. (27).
Thus, for this case, the upper bound of the theorem reduces to $(1 + s_1(S))/2$, which can
in fact always be achieved, as per Eq. (29) of the previous section. However, for $n \geq 3$ the
upper bound in Eq. (38) cannot always be attained, essentially because the set of orthogonal
matrices R in Eq. (30) is larger than the set of unitary transformations on $H_n$ [20].
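The bound of the Theorem is straightforward to evaluate once the matrix of Eq. (37) has been assembled. The following sketch shows one way this might be done; it assumes a generalised Gell-Mann basis for the traceless operators on $H_1$ and $H_2$, takes the state as a density matrix on its support $H_1 \otimes H_2$, and uses function and variable names (`theorem_bound`, `T_n`) that are purely illustrative.

```python
import numpy as np

def traceless_basis(d):
    """Orthonormal traceless Hermitian basis on C^d, with tr[K_p K_q] = delta_pq."""
    basis = []
    for j in range(d):
        for k in range(j + 1, d):
            sym = np.zeros((d, d), dtype=complex)
            sym[j, k] = sym[k, j] = 1 / np.sqrt(2)
            asym = np.zeros((d, d), dtype=complex)
            asym[j, k], asym[k, j] = -1j / np.sqrt(2), 1j / np.sqrt(2)
            basis += [sym, asym]
    for l in range(1, d):
        diag = np.zeros((d, d), dtype=complex)
        diag[:l, :l] = np.eye(l)
        diag[l, l] = -l
        basis.append(diag / np.sqrt(l * (l + 1)))
    return basis

def theorem_bound(rho, d1, d2, n):
    """Right-hand side of Eq. (38) for a state rho on C^d1 (x) C^d2, with n >= max(d1, d2)."""
    alpha1, alpha2 = np.sqrt(1 / d1 - 1 / n), np.sqrt(1 / d2 - 1 / n)
    Ks, Ls = traceless_basis(d1), traceless_basis(d2)
    id1, id2 = np.eye(d1), np.eye(d2)

    def expect(A, B):
        return np.trace(rho @ np.kron(A, B)).real

    # Assemble the matrix of Eq. (37): zeroth row/column plus the <K_p (x) L_q> block.
    T_n = np.zeros((d1 ** 2, d2 ** 2))
    T_n[0, 0] = alpha1 * alpha2
    T_n[0, 1:] = [alpha1 * expect(id1, L) for L in Ls]
    T_n[1:, 0] = [alpha2 * expect(K, id2) for K in Ks]
    T_n[1:, 1:] = [[expect(K, L) for L in Ls] for K in Ks]
    s = np.linalg.svd(T_n, compute_uv=False)       # sorted in descending order
    return 1 / n + s[: min(n - 1, min(d1, d2) ** 2)].sum()

# Two-qubit singlet state as a test case.
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
singlet = np.outer(psi, psi.conj())
print(theorem_bound(singlet, 2, 2, 2), theorem_bound(singlet, 2, 2, 10 ** 6))
```

For the singlet state the n = 2 value reproduces $(1 + s_1(S))/2 = 1$, while a very large n approaches the trace norm of the limiting matrix.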



B. Examples 



Note first that taking the limit $n \to \infty$ in Eq. (38) yields a global upper bound for the
coincidence rate, independent of the possible number of measurement outcomes:

$$ C_{\max}(\rho) \leq \sum_{k=1}^{\delta^2} s_k(T^{(\infty)}) = \mathrm{Tr}\sqrt{(T^{(\infty)})^T\, T^{(\infty)}} = \| T^{(\infty)} \|_{\mathrm{Tr}}. $$ (39)

Thus, the upper bound is just the trace norm of $T^{(\infty)}$. Noting that $\alpha_1 \to 1/\sqrt{d_1}$ and
$\alpha_2 \to 1/\sqrt{d_2}$ in this limit, it follows that the coefficients of $T^{(\infty)}$ yield a Fano form for $\rho$ on
$H_1 \otimes H_2$, via

$$ \rho = T^{(\infty)}_{00}\, 1_1 \otimes 1_2/\sqrt{d_1 d_2} + \sum_{p \geq 1} T^{(\infty)}_{p0}\, K_p \otimes 1_2/\sqrt{d_2} + \sum_{q \geq 1} T^{(\infty)}_{0q}\, 1_1 \otimes L_q/\sqrt{d_1} + \sum_{p,q \geq 1} T^{(\infty)}_{pq}\, K_p \otimes L_q. $$ (40)

The trace norm of $T^{(\infty)}$ may therefore be recognised as the 'computable cross norm' measure
of quantum entanglement [6, 7], i.e., the maximum possible coincidence rate, $C_{\max}(\rho)$, is
bounded above by the computable cross norm.

The computable cross norm cannot be greater than unity for any separable states [6, 7], 
and hence the upper bound in Eq. (39) is always nontrivial for separable states (and for 
a large proportion of entangled states). However, a stronger bound is postulated further 
below. 

Second, it is of interest to consider measurements restricted to the minimum number of
possible measurement outcomes, i.e., with n = d. For this case $\alpha_1 = \alpha_2 = 0$, implying that
the only nonvanishing part of $T^{(d)}$ is the $(d_1^2 - 1) \times (d_2^2 - 1)$ submatrix $T_{pq} = \langle K_p \otimes L_q \rangle$ with
$p, q \geq 1$, yielding

$$ C^{(d)}_{\max}(\rho) \leq 1/d + \sum_{k=1}^{\min\{d-1,\, \delta^2-1\}} s_k(T). $$ (41)

As noted above, this bound is tight for the case $d_1 = d_2 = 2$. For two-qudit systems it bounds
the coincidence rate for the case of measurements described by orthogonal POMs. It may
also be noted, in analogy to the computable cross norm above, that T has similarly been
used in partial characterisations of entanglement [21, 22]. For example, the trace norm of
T is never greater than $[(1 - 1/d_1)(1 - 1/d_2)]^{1/2}$ for any separable state [21]. These general
underlying connections, between bounds for correlations and measures of entanglement,
would be an interesting subject for further investigation (see also Sec. VI).

Third, a simple yet general example of the Theorem is provided by the Werner state for
two qudits [19], which has the Fano form [21]

$$ \rho_x := \frac{1}{d^2}\, 1 \otimes 1 + \frac{x - 1/d}{d^2 - 1} \sum_p K_p \otimes K_p, $$

with $-1 \leq x \leq 1$. It follows via Eqs. (32), (33) and (37) that $T^{(n)}$ is diagonal, and so, noting
that $\alpha_1 \alpha_2 = 1/d - 1/n$ for this case, the Theorem yields

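$$ C^{(n)}_{\max}(\rho_x) \leq \frac{1}{n} + (D - 1)\, \frac{|x - 1/d|}{d^2 - 1} + \max\left\{ \frac{1}{d} - \frac{1}{n},\; \frac{|x - 1/d|}{d^2 - 1} \right\}, $$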

where $D := \min\{n - 1, d^2\}$. Note that, in the limit $n \to \infty$, the righthand side approaches
the computable cross norm for Werner states, $1/d + |x - 1/d|$, as expected [7]. It is also
straightforward to verify via direct calculation that this bound is tight for the case n = d
and $x \geq 1/d$, i.e.,

$$ C^{(d)}_{\max}(\rho_x) = 1/d + |x - 1/d|/(d + 1) $$ (42)

for $x \geq 1/d$, achieved by the choice A = B. For $x < 1/d$, a modification of the Theorem
for negative definite T gives the tight upper bound $C^{(d)}_{\max}(\rho_x) = 1/d + |x - 1/d|/(d^2 - 1)$,
achieved by maximal orthogonal POMs satisfying $|a_j\rangle = |b_{P(j)}\rangle$ for any permutation P of
1, 2, ..., d with $P(j) \neq j$ for all j. This example may be regarded as a generalisation of the
d = 2 isotropic example in Sec. IV, where one identifies x with 1 - 2w.
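The two coincidence rates quoted for Werner states can be confirmed by direct calculation. The sketch below is a hedged illustration, not the original derivation: it relies on the standard identity $\sum_p K_p \otimes K_p = V - (1/d)\, 1 \otimes 1$, where V is the swap operator (an ingredient introduced here for convenience), and it takes both observers to measure in the computational basis, with Bob's outcomes optionally relabelled by a permutation. All function names are illustrative.

```python
import numpy as np

def werner_state(d, x):
    """rho_x = 1/d^2 + (x - 1/d)/(d^2 - 1) * (V - 1/d), with V the swap operator."""
    swap = np.zeros((d * d, d * d))
    for j in range(d):
        for k in range(d):
            swap[j * d + k, k * d + j] = 1.0
    c = (x - 1 / d) / (d ** 2 - 1)
    return np.eye(d * d) / d ** 2 + c * (swap - np.eye(d * d) / d)

def coincidence_rate(rho, d, perm):
    """sum_j <j, perm[j]| rho |j, perm[j]> for computational-basis measurements."""
    return sum(rho[j * d + perm[j], j * d + perm[j]] for j in range(d))

d = 3
same = coincidence_rate(werner_state(d, 0.8), d, [0, 1, 2])       # A = B, x >= 1/d
shifted = coincidence_rate(werner_state(d, -0.5), d, [1, 2, 0])   # P(j) != j, x < 1/d
print(same, 1 / d + abs(0.8 - 1 / d) / (d + 1))
print(shifted, 1 / d + abs(-0.5 - 1 / d) / (d ** 2 - 1))
```

Each printed pair compares the directly computed coincidence rate with the corresponding closed-form expression.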

Note finally that if the conjecture in Eq. (25) is correct, then the bound in Eq. (41) is in
fact an upper bound for $C_{\max}(\rho)$, which is generally much tighter than the computable cross
norm bound in Eq. (39). For example, consider any state for which the reduced density
operators are maximally random, i.e., where $\rho_1 = 1_1/d_1$ and $\rho_2 = 1_2/d_2$ (e.g., the Werner
state $\rho_x$ considered above). It then follows trivially via Eq. (37) that

$$ \| T^{(\infty)} \|_{\mathrm{Tr}} = (d_1 d_2)^{-1/2} + \| T \|_{\mathrm{Tr}} \geq 1/d + \| T \|_{\mathrm{Tr}}. $$

Thus, for such states, the bound in Eq. (41) is never greater than that in Eq. (39), and is
generally smaller whenever $d < \delta^2$.



VI. CONCLUSIONS 



It is well known that determining the maximum mutual information between the compo- 
nents of a given bipartite system is a difficult problem [9]. The results of this paper indicate 
that it is similarly not a straightforward matter to maximise the coincidence rate, despite 
(i) its linearity with respect to the density operator, and (ii) formal similarities with the 
well known problem of optimal state discrimination. A notable exception is the case of spin 
measurements on two-qubit systems, which has been fully solved in Sec. IV. More generally,
one only has available the formal equations for the correlation basis derived in Propositions
1 and 2 of Sec. III, and the upper bounds for n-valued measurements derived in the
Theorem of Sec. V. These general results could be substantially strengthened if the Conjecture
of Sec. III C could be verified. It would further be of interest to determine whether
or not the 'optimal discrimination' conditions in Eq. (14) must be satisfied by observables 
corresponding to a global maximum of coincidence rate. 

It is worth mentioning here some generalisations of the results in Secs. IV and V, to
other linear measures of correlation. For example, note that the spin correlation matrix S
in Eq. (27) is closely related to the spin covariance matrix $\tilde{S}$ defined by

$$ \tilde{S}_{jk} := \langle \sigma^{(1)}_j \otimes \sigma^{(2)}_k \rangle - \langle \sigma^{(1)}_j \rangle \langle \sigma^{(2)}_k \rangle. $$

In particular, explicitly indicating dependence on the density operator, one has $\tilde{S}(\rho) =
S(\rho) - S(\rho_1 \otimes \rho_2)$. This covariance matrix has been of recent interest in the characterisation
of entanglement [13, 22]. For example, the main result in Sec. IV of Ref. [22] may be
simplified to

$$ \mathrm{Tr}[\tilde{S}^T \tilde{S}] = 4\, \mathrm{tr}[(\rho - \rho_1 \otimes \rho_2)^2] \leq 1, $$ (43)

for all separable two-qubit states, i.e., a separable state $\rho$ can lie at distance of at most 1/2
from $\rho_1 \otimes \rho_2$, as measured by the Hilbert-Schmidt metric.
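The equality in Eq. (43) holds for every two-qubit state (only the bound by 1 is specific to separable states), and this can be checked numerically. The following is a small illustrative sketch with hypothetical helper names: it builds the spin covariance matrix from Pauli expectation values, forms the product of the reduced states by partial tracing, and compares the two sides of the identity for a randomly generated density matrix.

```python
import numpy as np

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
ID2 = np.eye(2)

def spin_covariance_matrix(rho):
    """<sigma_j (x) sigma_k> - <sigma_j (x) 1><1 (x) sigma_k> for a two-qubit state."""
    S = np.array([[np.trace(rho @ np.kron(sj, sk)).real for sk in PAULI]
                  for sj in PAULI])
    m = np.array([np.trace(rho @ np.kron(sj, ID2)).real for sj in PAULI])
    nvec = np.array([np.trace(rho @ np.kron(ID2, sk)).real for sk in PAULI])
    return S - np.outer(m, nvec)

def product_of_marginals(rho):
    """rho_1 (x) rho_2, with the reduced states obtained by partial traces."""
    r = rho.reshape(2, 2, 2, 2)
    rho1 = np.trace(r, axis1=1, axis2=3)   # trace out the second qubit
    rho2 = np.trace(r, axis1=0, axis2=2)   # trace out the first qubit
    return np.kron(rho1, rho2)

def random_state(rng):
    """A random full-rank two-qubit density matrix (illustrative sampling only)."""
    g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(1)
rho = random_state(rng)
cov = spin_covariance_matrix(rho)
delta = rho - product_of_marginals(rho)
lhs = np.trace(cov.T @ cov)
rhs = 4 * np.trace(delta @ delta).real
largest_singular_value = np.linalg.svd(cov, compute_uv=False)[0]
print(lhs, rhs, largest_singular_value)
```

The first two printed numbers should coincide; the third is the spectral norm of the covariance matrix for the sampled state.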

Now, the covariance of two arbitrary spin observables, corresponding to directions a and
b, may be written as

$$ \mathrm{Cov}(A, B|\rho) = a^T \tilde{S} b. $$

The methods of Sec. IV then immediately lead to the upper bound

$$ \max_{a,b} \mathrm{Cov}(A, B|\rho) = s_1(\tilde{S}), $$ (44)

analogous to Eq. (29), i.e., the maximum possible spin covariance for state $\rho$ is given by
the spectral norm of the spin covariance matrix. Note that this bound is invariant under
local unitary transformations. It follows, for example, that Theorem 1 of Ref. [13] may be
strengthened to the observable-independent statement that

tr[p 2 ] + ^iCS)<l (45) 

for all separable states of two-qubit systems. Noting Eq. (44), this inequality is also valid if
$\mathrm{Cov}(A, B|\rho)$ is substituted for $s_1(\tilde{S})$, for any spin observables A and B. Similarly, Eq. (24)
of Ref. [13] may be strengthened, using the methods of Sec. IV, to the entanglement bound

$$ E_N(\rho) \geq \max\{0, \log_2[s_1(\tilde{S}) + s_2(\tilde{S})]\} $$ (46)

for the logarithmic negativity of a two-qubit system. Thus, as in Sec. V, correlation and
entanglement bounds are seen to be closely related.

Finally, consider some general linear measure of correlation, of the form

$$ G(A, B|\rho) := \sum_{j,k} p_{jk}\, g_{jk} = \sum_{j,k} g_{jk} \langle a_j, b_k|\rho|a_j, b_k\rangle. $$

Coincidence rate corresponds to the choice $g_{jk} = \delta_{jk}$. The related 'covariance' measure

$$ \tilde{G}(A, B|\rho) := G(A, B|\rho) - G(A, B|\rho_1 \otimes \rho_2) $$

then has the desirable property of automatically vanishing for uncorrelated states. The
methods of Sec. V A may then be applied to $\tilde{G}$, with $\rho$ replaced by $\rho - \rho_1 \otimes \rho_2$, to yield the
corresponding upper bound

$$ |\tilde{G}(A^{(n)}, B^{(n)}|\rho)| = |\mathrm{Tr}[\tilde{T} R W_G]| \leq \sum_k s_k(\tilde{T})\, s_k(W_G), $$ (47)

analogous to Eq. (35), for maximal POMs $A^{(n)}$ and $B^{(n)}$ having n elements each. Here

$$ \tilde{T}_{pq} := \langle K_p \otimes L_q \rangle - \langle K_p \rangle \langle L_q \rangle, $$

and the definition of W in Eq. (34) is generalised to

$$ (W_G)_{pq} := \sum_{j,k} g_{jk} \langle x_j|L_p|x_j\rangle \langle y_k|L_q|y_k\rangle. $$

This bound is tight for spin measurements on two-qubit systems. For the choice $g_{jk} = \delta_{jk}$
the bound simplifies to

$$ |\mathrm{Corr}(A^{(n)}, B^{(n)}|\rho)| \leq \sum_{k=1}^{\min\{n-1,\, \delta^2-1\}} s_k(\tilde{T}) $$ (48)

for the 'correlation' $\sum_j (p_{jj} - p_j q_j)$ of any two n-valued maximal POMs (see Sec. II),
generalising Eq. (44) above. Note that one may simplify the calculation of the above bounds
by choosing the basis elements as in Sec. V, allowing one to replace $\tilde{T}$ by the submatrix
corresponding to $1 \leq p \leq (d_1)^2 - 1$ and $1 \leq q \leq (d_2)^2 - 1$.

APPENDIX A: PROOF OF PROPOSITION 2



To prove Proposition 2 in Sec. III B, note first that all infinitesimal variations of the
orthogonal POMs X and Y in Eq. (18) are generated by infinitesimal unitary transformations
on $H_\infty$, and hence are of the form

$$ |x_j\rangle \to \exp(i\epsilon M)|x_j\rangle, \qquad |y_j\rangle \to \exp(i\epsilon N)|y_j\rangle $$

for arbitrary Hermitian operators M and N on $H_\infty$, where $\epsilon$ is an infinitesimal real parameter.
Note from Eq. (18) that these variations are equivalent to keeping X and Y fixed and instead
varying the density operator, viz.

$$ \rho \to \rho_\epsilon := \exp(-i\epsilon K)\, \rho\, \exp(i\epsilon K), $$

with $K := M \otimes 1 + 1 \otimes N$. Expanding in powers of $\epsilon$ gives

$$ \rho_\epsilon = \rho - i\epsilon [K, \rho] - (1/2)\epsilon^2 [K, [K, \rho]] + \ldots $$

and hence the corresponding variation in coincidence rate is

$$ \delta C = -i\epsilon \sum_j \langle x_j, y_j|[K, \rho]|x_j, y_j\rangle - (1/2)\epsilon^2 \sum_j \langle x_j, y_j|[K, [K, \rho]]|x_j, y_j\rangle + \ldots. $$

Requiring the first-order variation to vanish yields

$$ 0 = \mathrm{tr}_1\Big( M \sum_j [X_j, \langle y_j|\rho|y_j\rangle] \Big) + \mathrm{tr}_2\Big( N \sum_j [Y_j, \langle x_j|\rho|x_j\rangle] \Big) $$

for arbitrary M and N, where $X_j$ and $Y_j$ denote $|x_j\rangle\langle x_j|$ and $|y_j\rangle\langle y_j|$ respectively. Hence,
each operator sum must vanish identically, and Eq. (19) follows as the matrix components
of these sums, with respect to the X and Y basis sets respectively.

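Requiring the second-order variation to be no greater than zero, as is required for a local
maximum, is equivalent to

$$ \sum_j \Big\{ \mathrm{tr}_1\big([M, [M, X_j]]\, \langle y_j|\rho|y_j\rangle\big) + \mathrm{tr}_2\big([N, [N, Y_j]]\, \langle x_j|\rho|x_j\rangle\big) + 2\, \mathrm{tr}\big(\rho\, [M, X_j] \otimes [N, Y_j]\big) \Big\} \geq 0. $$

Now, defining the Hermitian operators

$$ \tilde{V} := \sum_j \langle y_j|\rho|y_j\rangle\, |x_j\rangle\langle x_j|, \qquad \tilde{W} := \sum_j \langle x_j|\rho|x_j\rangle\, |y_j\rangle\langle y_j|, $$

and using Eq. (17) and $\sum_j X_j = 1$, the summation over the first term may be simplified to
give

$$ \sum_j \mathrm{tr}_1\big([M, [M, X_j]]\, \langle y_j|\rho|y_j\rangle\big) = \sum_j \mathrm{tr}_1\big( M X_j \langle y_j|\rho|y_j\rangle M + \mathrm{h.c.} - 2 X_j M \langle y_j|\rho|y_j\rangle M \big) $$
$$ = \sum_j \mathrm{tr}_1\big( X_j M [\tilde{V} + \tilde{V}^\dagger] M - 2 X_j M \langle y_j|\rho|y_j\rangle M \big) $$
$$ = 2 \sum_j \langle x_j| M (\tilde{V} - \langle b_j|\rho|b_j\rangle) M |x_j\rangle. $$

The summation over the second term may be similarly simplified in terms of $\tilde{W}$. Equation
(20) then immediately follows if it can be shown that $\tilde{V} = V$ and $\tilde{W} = W$.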

To do so, note first from Eqs. (12) and (17) that $\tilde{V} E = V$ and $\tilde{W} F = W$. Together with
their conjugates, these equations imply $[\tilde{V}, E] = 0 = [\tilde{W}, F]$, and hence that

$$ \tilde{V} = V + (1 - E)\tilde{V}(1 - E), \qquad \tilde{W} = W + (1 - F)\tilde{W}(1 - F). $$

Substitution into

$$ (\tilde{V} - \langle y_j|\rho|y_j\rangle)\, |x_j\rangle = 0, \qquad (\tilde{W} - \langle x_j|\rho|x_j\rangle)\, |y_j\rangle = 0, $$

(which is equivalent to Eq. (19) precisely as per the equivalence of Eqs. (8) and (9) in
Proposition 1), and using Eqs. (8) and (17), then gives

$$ (1 - E)\tilde{V}(1 - E)|x_j\rangle = 0 = (1 - F)\tilde{W}(1 - F)|y_j\rangle $$

for all j. But $\{|x_j\rangle\}$ and $\{|y_j\rangle\}$ are basis sets for $H_\infty$, implying the operators must vanish
identically, and the desired result immediately follows.